The Microsoft Relational Engine

نویسنده

  • Goetz Graefe
چکیده

Microsoft offers three very successful database products, FoxPro, Access and SQL Server. While SQL Server excels in multi-user transaction performance, Access and its underlying Jet engine excel as end-user and development tool for desktop and client-server applications. One of our top priorities is to improve the integration of these two products. With respect to their query processing capabilities, we plan on combining the strengths of SQL Server with those of Access. SQL Server's strengths are focused on management of very large tables, server-side cursors, and the use of stored procedures as scripts and as triggers. Access' strengths are queries over multiple servers, updatable query results, and bit-mapped processing. In our next generation of products, SQL Server will employ new query processing technology. Both optimization and execution will be based on an extensible set of operators. At this time, we are focusing on the relational algebra augmented with a few operators such as the "top" operator found in Access, and suitable loops-, index-, sort-, hash-, and bitmap-based execution algorithms. Moreover, we are planning on executing these algorithms both sequentially and in parallel. We have four design goals, namely functionality, performance, scalability, and extensibility. In the fhst release of the new technology, we focus on matching and surpassing existing query functionality of the two products. By providing a variety of algorithms and choosing among them very carefully, we hope to achieve excellent performance. Scalability from desktops to superservers in our goal; rather than essentially recompiling existing code on new platforms, we are re-designing and re-implementing a substantial component of our product. Finally, by architecting the new code base for extensibility, we not only integrate the latest query technology today but enable rapid innovation for years to come. R is really the last two goals that require the careful yet radical re-design of the query component that we are currently undertaking. The algebraic optimizer is based on the Cascades optimizer framework first developed outside of Microsoft in collaboration with Tandem. It employs a rule-based rewrite engine, guided by anticipated execution costs, heuristic guidance, branch-and-bound pruning, and heuristic pruning. By use of heuristics, the search engine integrates into one what is called query rewriting, join optimization, and algorithm selection. The information or properties carried up and down a query tree, in addition to costs, includes cardinality (captured as minimum, maximum, and expected values), degree (number of columns), column information (type, constraints, histograms), relational keys and functional dependencies, referential integrity constraints, sort order, columns present in memory, location, and partitioning. The information is divided into logical information, which does not change through an equivalence transformation of the query, and physical information, which is specific to a query plan. Query operators, rewrite rules, properties, and costs are all modelled as classes that can easily be augmented with new subclasses and new instances. The algebraic execution engine employs a uniform, demand-driven interface between algorithms, which we call iterators and which may be divided into work, enforcer, and control operations. For example, a mergeor hash-join truly modifies the data content, whereas a sort operation (without aggregation, duplicate removal, top, etc.) does not modify the data but enforces a particular representation. Special enforcer operations translate back and forth between bitmap representations and typical iterators over record sets. Control operations typically schedule execution within a single query, e.g., an operation at the root of a common

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Unifying Data and Domain Knowledge Using Virtual Views

The database community is on a constant quest for better integration of data management and knowledge management. Recently, with the increasing use of ontology in various applications, the quest has become more concrete and urgent. However, manipulating knowledge along with relational data in DBMSs is not a trivial undertaking. In this paper, we introduce a novel, unified framework for managing...

متن کامل

Relational support for flexible schema scenarios

Efficient support for applications that deal with data heterogeneity, hierarchies and schema evolution is an important challenge for relational engines. In this paper we show how this flexibility can be handled in Microsoft SQL Server. For this purpose, the engine has been equipped in an integrated package of relational extensions. The package includes sparse storage, column set operations, fil...

متن کامل

Microsoft English Query 7.5: Automatic Extraction of Semantics from Relational Databases and OLAP Cubes

1. What is English Query? Microsoft English Query (EQ) lets users pose database queries in plain English. To do this, a developer need only define the database semantics, in effect building a conceptual model of the database. EQ provides an Authoring Tool that allows the developer to define the set of the entities and relationships in the database along with the database objects that the entiti...

متن کامل

The Camaleon Web Wrapper Engine

The web is rapidly becoming the universal repository of information. A major challenge is the ability to support the effective flow of information among the sources and services on the web and their interconnection with legacy systems that were designed to operate with traditional relational databases. This paper describes a technology and infrastructure to address these needs, based on the des...

متن کامل

Fast Foreign-Key Detection in Microsoft SQL Server PowerPivot for Excel

Microsoft SQL Server PowerPivot for Excel, or PowerPivot for short, is an in-memory business intelligence (BI) engine that enables Excel users to interactively create pivot tables over large data sets imported from sources such as relational databases, text files and web data feeds. Unlike traditional pivot tables in Excel that are defined on a single table, PowerPivot allows analysis over mult...

متن کامل

A Structured Text ADT for Object-Relational Databases

There is a growing need, both for use within corporate intranets and within the rapidly evolving World Wide Web, to develop tools that are able to retrieve relevant textual information rapidly, to present textual information in a meaningful way, and to integrate textual information with related data retrieved from other sources. This paper introduces a model for structured text and presents a s...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1996